BBW: a batch balance wrapper for training deep neural networks on extremely imbalanced datasets with few minority samples
Authors
Abstract
In recent years, Deep Neural Networks (DNNs) have achieved excellent performance on many tasks, but it is very difficult to train good models from imbalanced datasets. Creating balanced batches, either by majority down-sampling or minority up-sampling, can solve the problem in certain cases; however, it may lead to instability of the learning process and overfitting. In this paper, we propose Batch Balance Wrapper (BBW), a novel framework that can adapt a general DNN so that it is well trained on extremely imbalanced datasets with few minority samples. In BBW, two extra network layers are added at the start of the DNN. The layers prevent overfitting of the minority samples and improve the expressiveness of the sample distribution of minority samples. Furthermore, Batch Balance (BB), a class-based sampling algorithm, is proposed to make sure each batch is always balanced during the learning process. We test BBW on three well-known extremely imbalanced datasets, where the maximum imbalance ratio reaches 1167:1 with only 16 positive samples. Compared with existing approaches, BBW achieves better classification performance. In addition, BBW-wrapped DNNs train up to 16.39 times faster relative to unwrapped DNNs. Moreover, BBW does not require data preprocessing or additional hyper-parameter tuning, operations that cost extra processing time. The experiments show that BBW can be applied to common applications with few minority samples, such as EEG signals, medical images, and so on.
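The abstract does not spell out the BB sampling algorithm, so the following is only a minimal sketch of the general idea of class-based balanced batch construction: each class contributes an equal share of every batch, with small minority classes sampled with replacement (up-sampled) to fill their quota. The function name `balanced_batches` and its signature are illustrative assumptions, not the authors' API.

```python
import random
from collections import defaultdict

def balanced_batches(samples, labels, batch_size, n_batches, seed=0):
    """Yield batches in which every class contributes an equal share.

    Classes with few examples are sampled with replacement, so even a
    tiny minority class (e.g. 16 positives) fills its per-batch quota.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for s, y in zip(samples, labels):
        by_class[y].append(s)
    classes = sorted(by_class)
    per_class = batch_size // len(classes)  # equal quota per class
    for _ in range(n_batches):
        batch = []
        for c in classes:
            pool = by_class[c]
            # sampling with replacement acts as minority up-sampling
            batch.extend((rng.choice(pool), c) for _ in range(per_class))
        rng.shuffle(batch)
        yield batch
```

With an 18:2 class split and `batch_size=8`, every yielded batch still contains exactly four examples of each class, which is the property BB is described as guaranteeing throughout training.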
Similar resources
AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks
Training deep neural networks with Stochastic Gradient Descent, or its variants, requires careful choice of both learning rate and batch size. While smaller batch sizes generally converge in fewer training epochs, larger batch sizes offer more parallelism and hence better computational efficiency. We have developed a new training approach that, rather than statically choosing a single batch siz...
A Scalable Near-Memory Architecture for Training Deep Neural Networks on Large In-Memory Datasets
Most investigations into near-memory hardware accelerators for deep neural networks have primarily focused on inference, while the potential of accelerating training has received relatively little attention so far. Based on an in-depth analysis of the key computational patterns in state-of-the-art gradient-based training methods, we propose an efficient near-memory acceleration engine called NT...
Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance
This study investigates the effect of class imbalance in training data when developing neural network classifiers for computer-aided medical diagnosis. The investigation is performed in the presence of other characteristics that are typical among medical data, namely small training sample size, large number of features, and correlations between features. Two methods of neural network training a...
Neural Voice Cloning with a Few Samples
Voice cloning is a highly desired feature for personalized speech interfaces. Neural network based speech synthesis has been shown to generate high quality speech for a large number of speakers. In this paper, we introduce a neural voice cloning system that takes a few audio samples as input. We study two approaches: speaker adaptation and speaker encoding. Speaker adaptation is based on fine-t...
Batch Kalman Normalization: Towards Training Deep Neural Networks with Micro-Batches
As an indispensable component, Batch Normalization (BN) has successfully improved the training of deep neural networks (DNNs) with mini-batches, by normalizing the distribution of the internal representation for each hidden layer. However, the effectiveness of BN would diminish with scenario of micro-batch (e.g. less than 10 samples in a mini-batch), since the estimated statistics in a mini-bat...
Journal
Journal title: Applied Intelligence
Year: 2021
ISSN: 0924-669X, 1573-7497
DOI: https://doi.org/10.1007/s10489-021-02623-9